Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models
نویسندگان
چکیده
Achieving efficient and scalable exploration in complex domains poses a major challenge in reinforcement learning. While Bayesian and PAC-MDP approaches to the exploration problem offer strong formal guarantees, they are often impractical in higher dimensions due to their reliance on enumerating the state-action space. Hence, exploration in complex domains is often performed with simple epsilon-greedy methods. To achieve more efficient exploration, we develop a method for assigning exploration bonuses based on a concurrently learned model of the system dynamics. By parameterizing our learned model with a neural network, we are able to develop a scalable and efficient approach to exploration bonuses that can be applied to tasks with complex, high-dimensional state spaces. We demonstrate our approach on the task of learning to play Atari games from raw pixel inputs. In this domain, our method offers substantial improvements in exploration efficiency when compared with the standard epsilon greedy approach. As a result of our improved exploration strategy, we are able to achieve state-of-the-art results on several games that pose a major challenge for prior methods.
منابع مشابه
EX2: Exploration with Exemplar Models for Deep Reinforcement Learning
Deep reinforcement learning algorithms have been shown to learn complex tasks using highly general policy classes. However, sparse reward problems remain a significant challenge. Exploration methods based on novelty detection have been particularly successful in such settings but typically require generative or predictive models of the observations, which can be difficult to train when the obse...
متن کاملExploration for Multi-task Reinforcement Learning with Deep Generative Models
Exploration in multi-task reinforcement learning is critical in training agents to deduce the underlying MDP. Many of the existing exploration frameworks such as E, Rmax, Thompson sampling assume a single stationary MDP and are not suitable for system identification in the multi-task setting. We present a novel method to facilitate exploration in multi-task reinforcement learning using deep gen...
متن کاملDeep Reinforcement Learning for De-Novo Drug Design
We propose a novel computational strategy based on deep and reinforcement learning techniques for de-novo design of molecules with desired properties. This strategy integrates two deep neural networks – generative and predictive – that are trained separately but employed jointly to generate novel chemical structures with the desired properties. Generative models are trained to produce chemicall...
متن کاملA Study of Qualitative Knowledge-Based Exploration for Continuous Deep Reinforcement Learning
As an important method to solve sequential decisionmaking problems, reinforcement learning learns the policy of tasks through the interaction with environment. But it has difficulties scaling to largescale problems. One of the reasons is the exploration and exploitation dilemma which may lead to inefficient learning. We present an approach that addresses this shortcoming by introducing qualitat...
متن کاملTowards Cognitive Exploration through Deep Reinforcement Learning for Mobile Robots
Exploration in an unknown environment is the core functionality for mobile robots. Learning-based exploration methods, including convolutional neural networks, provide excellent strategies without human-designed logic for the feature extraction [1]. But the conventional supervised learning algorithms cost lots of efforts on the labeling work of datasets inevitably. Scenes not included in the tr...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1507.00814 شماره
صفحات -
تاریخ انتشار 2015